Sec 2. Basic concepts
1)physical systems: degrees of freedom, underactuation, nonholonomic constraints, mode switching, piecewise continuity;

2)interactive perception

  • purpose: estimate object properties; predict the effects of actions;
  • can be used for: self-supervised learning
  • how to do it: active learning (recurs throughout; can be used to learn both transition models and policies)

3)Hierarchical Task Decompositions and Skill Reusability
Decompose tasks top-down, layer by layer (reducing the complex to the simple), and reuse skills.

4)object-centric generalization

generalization via objects—both across different objects, and between similar (or identical) objects in different task instances

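Note: this is quite difficult in practice, since it requires generalizing across different objects. E.g. at the robot level, consider a compliant gripper that adapts to a variety of objects, or at the object level, abstract out a general-level representation.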

Sec 3. Formalization
Purpose: summarize the whole task family:

A task family is a distribution, P(M), over MDPs, each of which is a task.

$M_i = (S_i, A, R_i, T_i, \gamma, \tau)$

Note: a skill is a higher-level action, modeled as an option: $o = (I_o, \beta_o, \pi_o)$, where $I_o$ is the initiation set, $\beta_o$ the termination condition, and $\pi_o$ the option policy.
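The option tuple can be sketched as a small data structure. This is only an illustrative sketch: the 1-D state, the `reach` skill, the `run_option` helper, and the deterministic treatment of the termination condition are all assumptions made for the example, not part of the notes.

```python
from dataclasses import dataclass
from typing import Callable

State = float   # hypothetical 1-D state, e.g. gripper distance to a target
Action = float  # hypothetical 1-D action, e.g. a commanded displacement


@dataclass
class Option:
    """An option o = (I_o, beta_o, pi_o)."""
    initiation: Callable[[State], bool]     # I_o: states where the option may start
    termination: Callable[[State], float]   # beta_o: probability of terminating in s
    policy: Callable[[State], Action]       # pi_o: low-level action to take in s


def run_option(option, state, transition, max_steps=100):
    """Run the option until beta_o reaches 1 (treated as deterministic here)."""
    if not option.initiation(state):
        raise ValueError("option cannot be initiated in this state")
    for _ in range(max_steps):
        if option.termination(state) >= 1.0:
            break
        state = transition(state, option.policy(state))
    return state


# Hypothetical "reach" skill: move halfway toward the target at 0 on each step.
reach = Option(
    initiation=lambda s: abs(s) <= 1.0,
    termination=lambda s: 1.0 if abs(s) < 1e-3 else 0.0,
    policy=lambda s: -0.5 * s,
)
final_state = run_option(reach, 1.0, transition=lambda s, a: s + a)
```

A higher-level policy would then choose among such options instead of among primitive actions.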

Sec 4. Defining and learning the state and context spaces
1)object representation:
A. Overview: within-task or across-task (context)
B. Specific types: pose, shape, material, interaction or relative properties
C. Hierarchies: point level; part level; object level (from the low level up to the high-level whole)
e.g.

  • point level: pixels (contact points, segmentation);
  • part level: a mug can be seen as having an opening for pouring, a bowl for containing, a handle for grasping, and a bottom for placing;
  • object level: groups of objects, e.g. a stack of blocks;
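The mug example can be written down as a small nested structure, which makes the point/part/object levels concrete. This is a rough sketch only; the class names `Part` and `ObjectModel` and the `part_for` lookup are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # a 3-D point, e.g. from a segmented point cloud


@dataclass
class Part:
    name: str         # e.g. "handle"
    affordance: str   # what the part is used for, e.g. "grasping"
    points: List[Point] = field(default_factory=list)  # point-level data for this part


@dataclass
class ObjectModel:
    name: str
    parts: Dict[str, Part] = field(default_factory=dict)

    def add_part(self, part: Part) -> None:
        self.parts[part.name] = part

    def part_for(self, affordance: str) -> Part:
        """Look up the part that supports a given interaction."""
        for part in self.parts.values():
            if part.affordance == affordance:
                return part
        raise KeyError(affordance)


mug = ObjectModel("mug")
for name, affordance in [("opening", "pouring"), ("bowl", "containing"),
                         ("handle", "grasping"), ("bottom", "placing")]:
    mug.add_part(Part(name, affordance))
```

With this layout, `mug.part_for("grasping")` returns the handle, so a skill can be written against the affordance rather than against one specific mug.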

2)method: passive and interactive perception
e.g. camera observation or human imitation (passive) vs. interaction measured through sensors (interactive)

3)steps: discover objects; determine their degrees of freedom; estimate object properties
Note: active learning approaches are often used to select informative actions for quickly determining the model parameters

Sec 5. Transition model
1)General form
  A deterministic function $T: S \times A \longrightarrow S$ or a stochastic distribution $T: S \times A \times S \longrightarrow \mathbb{R}$
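The two forms differ only in their signature, which a short sketch may make clearer. The 1-D dynamics `s + a` and the Gaussian noise model below are assumptions chosen for illustration, as are all function names.

```python
import math
import random


def t_deterministic(s: float, a: float) -> float:
    """T: S x A -> S. The next state is a function of state and action."""
    return s + a


def t_density(s: float, a: float, s_next: float, sigma: float = 0.1) -> float:
    """T: S x A x S -> R. Probability density of s_next given (s, a)."""
    mean = t_deterministic(s, a)
    z = (s_next - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def t_sample(s: float, a: float, sigma: float = 0.1, rng=random) -> float:
    """Draw one next state from the stochastic model."""
    return rng.gauss(t_deterministic(s, a), sigma)
```

The deterministic model returns a single next state, while the stochastic one scores every candidate next state and can also be sampled from.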

2)Types: continuous; discrete; hybrid models

The discrete components of the state are often used to capture high-level task information while the continuous components capture low-level state information.

Key point: continuous models
My view: e.g. a 6-DoF action (x, y, z, rx, ry, rz) is continuous; the same holds for a state such as an object pose.
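A hybrid state that pairs a discrete mode with a continuous pose could look like the sketch below. The `Mode` values and the rule that the pose only changes while the object stays grasped are hypothetical simplifications, not something stated in the notes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

Pose = Tuple[float, float, float, float, float, float]  # x, y, z, rx, ry, rz


class Mode(Enum):
    FREE = "free"        # object resting, not held
    GRASPED = "grasped"  # object held by the gripper


@dataclass(frozen=True)
class HybridState:
    mode: Mode   # discrete component: high-level task information
    pose: Pose   # continuous component: low-level state information


def step(state: HybridState, delta: Pose, close_gripper: bool) -> HybridState:
    """Hypothetical hybrid transition: the object moves only while it stays grasped."""
    mode = Mode.GRASPED if close_gripper else Mode.FREE
    if state.mode is Mode.GRASPED and mode is Mode.GRASPED:
        pose = tuple(p + d for p, d in zip(state.pose, delta))
    else:
        pose = state.pose
    return HybridState(mode, pose)
```

The discrete mode switches instantly on grasp or release, while the pose evolves continuously within a mode, which is the piecewise structure hybrid models are meant to capture.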

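3)Stochasticity (e.g. an attempt to open a door does not always succeed) and uncertainty (which can be reduced by gathering additional data)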
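The difference between the two can be illustrated with the door example. The sketch below assumes a fixed success probability of 0.7 and uses a Beta-Bernoulli estimate; both are choices made for this example rather than anything prescribed by the notes. The outcome of each attempt stays random no matter how much data is collected, while the uncertainty about the success probability shrinks as attempts accumulate.

```python
import random


def open_door(p_success: float, rng: random.Random) -> bool:
    """Stochastic outcome: the same action succeeds only with probability p_success."""
    return rng.random() < p_success


def beta_posterior(successes: int, failures: int):
    """Mean and variance of a Beta(1 + successes, 1 + failures) posterior."""
    a, b = 1 + successes, 1 + failures
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


rng = random.Random(0)
true_p = 0.7  # assumed value, unknown to the learner
outcomes = [open_door(true_p, rng) for _ in range(500)]

# The posterior variance (uncertainty) drops as more attempts are observed.
for n in (5, 50, 500):
    wins = sum(outcomes[:n])
    mean, var = beta_posterior(wins, n - wins)
    print(f"n={n:3d}  estimated p={mean:.2f}  posterior variance={var:.5f}")
```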

4)how to learn: Self-supervision and Exploration
sample: act, then observe the effect, to obtain a tuple (s, a, s')

• Random sampling.
• Active sampling: select the action samples that are the most informative.
• Intrinsic motivation: the robot actively attempts to discover novel scenarios where its model currently performs poorly or that result in salient events.
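The act-then-observe loop, together with the contrast between random and active sampling, can be sketched on a toy system. Everything here is assumed for illustration: the linear dynamics s' = s + k·a + noise, the unknown gain k = 0.8, and the function names. For this particular linear model the variance of the least-squares estimate of k is noise² / Σa², so the largest-magnitude actions happen to be the most informative; that shortcut does not carry over to general models.

```python
import random


def true_system(s, a, rng, k=0.8, noise=0.05):
    """Hypothetical ground truth: s' = s + k * a + noise (k is unknown to the learner)."""
    return s + k * a + rng.gauss(0.0, noise)


def collect(policy, n, rng):
    """Act, then observe the effect, to obtain (s, a, s') tuples."""
    data, s = [], 0.0
    for _ in range(n):
        a = policy(rng)
        s_next = true_system(s, a, rng)
        data.append((s, a, s_next))
        s = s_next
    return data


def fit_gain(data):
    """Least-squares estimate of k in s' - s = k * a."""
    num = sum(a * (s_next - s) for s, a, s_next in data)
    den = sum(a * a for _, a, _ in data)
    return num / den


def information(data):
    """Sum of a^2; a larger value means a lower-variance estimate of k."""
    return sum(a * a for _, a, _ in data)


def random_policy(rng):
    return rng.uniform(-1.0, 1.0)    # random sampling


def active_policy(rng):
    return rng.choice([-1.0, 1.0])   # largest-magnitude actions only


rng = random.Random(0)
random_data = collect(random_policy, 20, rng)
active_data = collect(active_policy, 20, rng)
k_random, k_active = fit_gain(random_data), fit_gain(active_data)
```

Comparing `information(random_data)` with `information(active_data)` shows how much more each actively chosen sample contributes under the same sample budget.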